Minimal Network Coding for Multicast 



Kapil Bhattad*, Niranjan Ratnakar^, Ralf Koetter^, and Krishna R. Narayanan* 
*Texas A&M University, College Station, TX. Email: kbhattad,krn@ee.tamu.edu 
^University of Illinois, Urbana-Champaign, IL. Email: ratnakar,koetter@uiuc.edu 



Abstract — We give an information flow interpretation for 
multicasting using network coding. This generalizes the fluid 
model used to represent flows to a single receiver. Using the 
generalized model, we present a decentralized algorithm to 
minimize the number of packets that undergo network coding. 
We also propose a decentralized algorithm to construct capacity 
achieving multicast codes when the processing at some nodes is 
restricted to routing. The proposed algorithms can be coupled 
with existing decentralized schemes to achieve minimum cost 
multicast. 

I. Introduction 

In their seminal work, Ahlswede et al. [1] showed that if the 
nodes in the network are allowed to perform network coding 
rather than just routing, then the max-flow min-cut bound on the 
multicast capacity is achievable. Li et al. [2] showed that linear 
codes are sufficient to achieve the multicast capacity. Since 
then, several techniques have been proposed to design codes 
that achieve the multicast capacity. Among them, the idea of 
random network coding seems very promising. Ho et al. [3], 
[4] propose a scheme in which data is collected in the form 
of packets of, say, length $n$. These packets are then treated as 
elements of a finite field of size $q = 2^n$ (assuming that the data 
is in bits). They show that if the messages on the outgoing 
edges of every node are set to be random linear combinations 
of the messages received along the incoming edges over a 
finite field of size $q$, then the probability that the resulting code 
is not a valid multicast code is $O(1/q)$. (We call a multicast 
code valid if the destination nodes can decode the data.) Therefore a 
valid multicast code can be designed with very high probability 
by random coding over a large field. 
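
As a rough illustration of this claim, the following Python sketch draws random coding coefficients on a small butterfly-style network with two receivers and estimates how often the resulting code is not decodable. The topology, the function names, and the use of a prime field GF(p) in place of a field of size 2^n are assumptions made here for brevity; they are not taken from [3], [4].

```python
import random

def butterfly_is_valid(p, rng):
    """True if random coefficients over GF(p) let both receivers decode two packets."""
    def combine(vectors):
        # Random linear combination of the incoming global coding vectors.
        out = [0, 0]
        for vec in vectors:
            c = rng.randrange(p)
            out = [(o + c * x) % p for o, x in zip(out, vec)]
        return out

    def det(u, w):
        return (u[0] * w[1] - u[1] * w[0]) % p

    source = [[1, 0], [0, 1]]          # the two source packets
    s_a, s_b = combine(source), combine(source)
    a_r1, a_c = combine([s_a]), combine([s_a])
    b_r2, b_c = combine([s_b]), combine([s_b])
    c_d = combine([a_c, b_c])          # the only node that mixes two inputs
    d_r1, d_r2 = combine([c_d]), combine([c_d])
    # A receiver decodes when its two received coding vectors are independent.
    return det(a_r1, d_r1) != 0 and det(b_r2, d_r2) != 0

def failure_rate(p, trials=2000, seed=0):
    """Empirical fraction of random codes that are not valid multicast codes."""
    rng = random.Random(seed)
    failures = sum(not butterfly_is_valid(p, rng) for _ in range(trials))
    return failures / trials
```

Comparing `failure_rate(2)` with `failure_rate(257)` should show the failure fraction shrinking as the field grows, in line with the O(1/q) behaviour quoted above.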

Random network coding by itself could be inefficient in 
terms of network resources. Since the scheme is completely 
distributed and there is no communication between the nodes, 
each node sends messages on all its outgoing edges, in the 
process using up all the available bandwidth. This problem 
can, however, be solved. In [5], [6] Lun et al. proposed a distributed 
algorithm which can be used, for example, to find a 
subnetwork that minimizes link usage costs while having the same 
multicast capacity as the given network. Random network 
coding can be employed on this subnetwork to achieve the 
multicast capacity. 

In general, minimal cost network coding solutions are of 
practical interest. The cost to be minimized may depend 
on the network and application at hand. For example, if a 
router that employs network coding is expensive we will 
want to minimize the number of nodes that perform network 
coding. In optical networks the operation of computing linear 
combinations of inputs may require conversion from optical 
signals to electrical signals, which is expensive, and hence we 
may want to minimize the number of packets that undergo 
network coding. Random network coding as such would result 
in schemes where every node performs network coding. In 
this paper, we will address the problem of minimal cost 
network coding where the cost is the number of packets that 
need to be network coded. We also consider the problem of 
finding minimum cost solutions when some of the nodes are 
restricted to perform only routing. The multicast capacity for 
a special case of this problem when all the nodes only route 
has been studied in [7]. We will refer to nodes employing 
network coding as network coding nodes and to nodes restricted 
to routing as routing nodes. 

In [5], the authors consider costs such as bandwidth and 
delay and investigate minimum cost multicast. However, the 
results in [5] cannot be directly used to solve the problems 
considered in this paper because the fluid model used to 
represent flows to individual receivers cannot be used when 
some of the nodes are restricted to routing. It is also not 
possible to differentiate between the operations of network 
coding and routing at a node by only looking at the input 
and output flows of that node. The main contribution of this 
paper is to give a new information flow based interpretation 
for the multicast flow and use this model to set up optimization 
problems that can be solved in a distributed manner. 

The optimization problem formulated in this paper has 
a complexity that grows exponentially with the number of 
receivers but in many applications like video conferencing the 
number of receivers is quite small and hence these algorithms 
can be of practical use. 

In Section II we give the notation used in the paper. We 
present the new information flow model in Section III. In 
Section IV we set up the optimization problems and finally 
conclude in Section V. 

II. Notation 

We represent a network by a directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, 
where $\mathcal{V}$ is the set of vertices (nodes) and $\mathcal{E}$ is the set of 
edges (links). The capacity of edge $e \in \mathcal{E}$ is given by $C(e)$. 
For each node $v \in \mathcal{V}$ we define sets $E_{in}(v)$ and $E_{out}(v)$ as 
the set of all edges that come into $v$ and that go out of $v$ 
respectively. 

We consider a multicast problem with one source $S \in \mathcal{V}$ 
and $K$ receivers in the set $D \subset \mathcal{V}$. We assume that 
$D = \{1, 2, \ldots, K\}$. For convenience we define two sets 
$\mathcal{P}$ and $\mathcal{Q}$, where $\mathcal{P}$ is the power set of $D$ (neglecting the 
empty set) and $\mathcal{Q}$ is a set containing all collections of two 
or more disjoint sets in $\mathcal{P}$. For example, when $K = 3$, 
$\mathcal{P} = \{\{1\}, \{2\}, \{3\}, \{1,2\}, \{1,3\}, \{2,3\}, \{1,2,3\}\}$ and 
$\mathcal{Q} = \{ \{\{1\},\{2\}\}, \{\{1\},\{3\}\}, \{\{2\},\{3\}\}, \{\{1\},\{2\},\{3\}\}, \{\{1\},\{2,3\}\}, \{\{2\},\{1,3\}\}, \{\{3\},\{1,2\}\} \}$. 
We fix the ordering in $\mathcal{P}$ and $\mathcal{Q}$ and represent the $i$-th 
element in $\mathcal{P}$ and the $j$-th element in $\mathcal{Q}$ by $P_i$ and $Q_j$ respectively. 
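
The two index sets can be enumerated mechanically. The sketch below is one way to do it in Python; the function names and the particular ordering (by subset size, then lexicographically) are choices made here, since the text only requires that some ordering be fixed.

```python
from itertools import combinations

def receiver_sets(K):
    """All nonempty subsets of {1, ..., K}, ordered by size and then lexicographically."""
    D = range(1, K + 1)
    return [frozenset(c) for size in range(1, K + 1) for c in combinations(D, size)]

def disjoint_collections(P):
    """All collections of two or more pairwise disjoint sets drawn from P."""
    result = []

    def extend(start, chosen, used):
        # Every prefix with at least two members is itself a valid collection.
        if len(chosen) >= 2:
            result.append(tuple(chosen))
        for idx in range(start, len(P)):
            if not (P[idx] & used):
                extend(idx + 1, chosen + [P[idx]], used | P[idx])

    extend(0, [], frozenset())
    return result
```

For K = 3 this produces seven receiver sets and seven collections, matching the example above. Both lists grow quickly with K, which is the source of the exponential complexity mentioned in the introduction.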

With each edge $e \in \mathcal{E}$ and a coding scheme we associate an 
information flow vector $X_e$ of length $2^K - 1$, where the $i$-th element, 
denoted by $x_e(P_i)$, represents the amount of information 
common to and only common to receivers in the set $P_i$ that 
flows through the edge $e$. We define 

$$I_k(X_e) = \sum_{i : k \in P_i} x_e(P_i)$$

as the amount of flow along edge $e$ in the flow decomposition 
of receiver $k$. The definitions will be made precise in Section III. 

It is sometimes convenient to assume that the edge 
capacities and the flow vectors are integers. This assumption 
is justified since we can always consider the network over 
multiple time instances. 

III. Information Flow 

In a multicast setup, any multicast solution can be decomposed 
into flows to individual receivers [1], [2]. The flows to 
different receivers could overlap. Overlapping flows indicate 
that the data sent along the overlapping part of the flow has 
to be eventually conveyed to all the receivers whose flows 
overlap. 

The main idea here is to partition the flows to the individual 
receivers into components of the form $x_e(P_i)$. We formally do 
this in the rest of the section. We define $x_e(\{k_1, k_2, \ldots, k_j\})$ 
for an edge $e$ as the amount of overlap in the flows along $e$ 
from the source node to the receiver nodes $k_1, k_2, \ldots, k_j$ 
that does not overlap with the flow to any other receiver. 
To identify the overlapping flows, consider a network obtained 
by expanding the original network by replacing each edge $e$ by 
parallel edges $e'_1, \ldots, e'_{C(e)}$ of unit capacity (assuming edge 
capacities are integers). The expanded network also supports 
the same rate ($h$, also assumed to be an integer) as the original 
network and hence $h$ edge-disjoint paths from the source to 
receiver $k$ can be found for each $k$ [1], [2]. The paths to the 
different receivers could have overlapping edges. For an edge 
$e'$ in the expanded network, let $P_i$ be the set of all receivers 
that have edge $e'$ in one of their paths. The element $x_{e'}(P)$ 
in the information flow vector for $e'$ is then 1 for $P = P_i$ 
and 0 otherwise. If no paths pass through $e'$ its information 
flow vector is zero. The information flow vector for the edge 
$e$ in the original network is the sum of the information flow 
vectors of the parallel edges $e'_1, \ldots, e'_{C(e)}$. 

To keep the notation brief we also use $x_e(k_1, k_2, \ldots, k_j)$ 
with $k_1 < k_2 < \cdots < k_j$ to represent $x_e(\{k_1, k_2, \ldots, k_j\})$. 
It is easy to see that the flow to receiver $k$ along edge $e$ is 
given by $\sum_{i : k \in P_i} x_e(P_i)$. Since this is a function of $X_e$ for 
each $k$, we represent it by $I_k(X_e)$. We show the flow vector 
and information flow vector for some multicast networks in 
Example 1. 
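
The construction above can be sketched in a few lines of Python. The data layout is an assumption made here: each unit-capacity edge of the expanded network is written as a pair (edge, copy_index), and the network used to exercise the code is a hypothetical unit-capacity butterfly with source "S" and receivers 1 and 2, not necessarily the network of Example 1.

```python
from collections import defaultdict

def information_flow_vectors(paths_by_receiver):
    """Map each original edge to {receiver set P: x_e(P)}.

    paths_by_receiver[k] lists the edge-disjoint paths to receiver k; each path
    is a list of unit-capacity edges written as (edge, copy_index).
    """
    users = defaultdict(set)
    for k, paths in paths_by_receiver.items():
        for path in paths:
            for unit_edge in path:
                users[unit_edge].add(k)
    X = defaultdict(dict)
    for (edge, _copy), receivers in users.items():
        P = frozenset(receivers)
        # Sum the unit vectors of the parallel copies of the same original edge.
        X[edge][P] = X[edge].get(P, 0) + 1
    return dict(X)

def flow_to_receiver(X_e, k):
    """I_k(X_e): sum of x_e(P) over all receiver sets P that contain k."""
    return sum(amount for P, amount in X_e.items() if k in P)

# Hypothetical butterfly, rate h = 2: two edge-disjoint paths per receiver.
paths = {
    1: [[(("S", "A"), 0), (("A", 1), 0)],
        [(("S", "B"), 0), (("B", "C"), 0), (("C", "D"), 0), (("D", 1), 0)]],
    2: [[(("S", "B"), 0), (("B", 2), 0)],
        [(("S", "A"), 0), (("A", "C"), 0), (("C", "D"), 0), (("D", 2), 0)]],
}
X = information_flow_vectors(paths)
```

In this instance the shared edge ("C", "D") carries one unit for the set {1, 2}, while ("A", 1) carries one unit for {1} only.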

Example 1: Consider the network shown in Fig. 1. A code that 
achieves the multicast capacity is also shown in Fig. 1, together 
with the flows to the two receivers and the information flow 
vector for each edge. 
The information flow vector is (1,0,0) when the edge carries 
data at unit rate only for receiver 1, is (0,1,0) when the edge 
carries data at unit rate only for receiver 2, and is (0,0,1) when the 
edge carries data at unit rate meant for both receivers. 
Fig. 1 also shows a code over two time instances that achieves 
the routing capacity of the network [7], along with 
the corresponding flows. We note that the edge between node 
4 and node 3 has flows for both receivers but they are not 
overlapping flows. It is easy to verify in each panel of Fig. 1 
that $I_1(X_e)$ and $I_2(X_e)$ give the amount of flow along edge 
$e$ to receivers 1 and 2 respectively. 

Fig. 1. Multicast Flows 

The amount of data flowing along an edge $e$ is the sum of 
the elements of $X_e$. Since each edge has a capacity constraint, 
we have the following constraint on $X_e$: 

$$\mathrm{sum}(X_e) = b^T X_e \le C(e) \quad \forall e \in \mathcal{E} \tag{1}$$

where $b$ is the all-one vector of length $2^K - 1$. We refer to the 
constraint in (1) as the edge constraint. 

Theorem 1: In any multicast network that supports a rate 
$h$, we can find $X_e$ for every edge $e \in \mathcal{E}$ satisfying the edge 
constraint such that 

$$\sum_{e \in E_{in}(v)} I_k(X_e) = \sum_{e \in E_{out}(v)} I_k(X_e) \quad \forall k, \ \forall v \in \mathcal{V} - \{S, k\}$$
$$\sum_{e \in E_{out}(S)} I_k(X_e) = \sum_{e \in E_{in}(k)} I_k(X_e) = h \quad \forall k \tag{2}$$

Proof: It is possible to decompose any multicast code 
into $h$ flows to individual receivers [1]. Consider the $X_e$'s 
and $I_k(X_e)$'s corresponding to one such flow decomposition. 
$I_k(X_e)$ is the amount of flow along edge $e$ in the flow 
decomposition of receiver $k$. The equations in (2) state that 
in the flow decomposition of receiver $k$, the flow coming into 
any intermediate node is equal to the flow coming out of the 
node, the flow coming out of the source node is $h$, and the 
flow into receiver $k$ is $h$. These are well-known properties of 
the flow decomposition [1]. ■ 
It is convenient to define $X^v_{in}$ and $X^v_{out}$ for node $v \in \mathcal{V}$ as 
$\sum_{e \in E_{in}(v)} X_e$ and $\sum_{e \in E_{out}(v)} X_e$ respectively. To keep the 
notation brief we will drop the superscript $v$ in discussions 
involving just one node. Since $I$ is a linear function of $X$, the 
conditions in (2) reduce to 

$$I_k(X^v_{in}) = I_k(X^v_{out}) \quad \forall k, \ \forall v \in \mathcal{V} - \{S, k\}$$
$$I_k(X^S_{out}) = I_k(X^k_{in}) = h \quad \forall k \tag{3}$$

We will call the necessary conditions in (3) the flow constraints. 
In Fig. 1 we can easily see that the edge and the flow 
constraints are satisfied. 
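
The edge constraint and the flow constraints can be checked directly once the information flow vectors are known. The following is a minimal sketch; the dictionary layout (edge -> {receiver set: amount}), the function names, and the hypothetical butterfly instance are assumptions made here for illustration.

```python
def I(X_e, k):
    """Flow to receiver k carried by an information flow vector."""
    return sum(amount for P, amount in X_e.items() if k in P)

def satisfies_edge_and_flow_constraints(X, capacity, source, receivers, h):
    # Edge constraint: total information flow on e is at most C(e).
    if any(sum(X[e].values()) > capacity[e] for e in X):
        return False
    nodes = {u for u, _ in X} | {v for _, v in X}
    for k in receivers:
        for v in nodes:
            inflow = sum(I(X[e], k) for e in X if e[1] == v)
            outflow = sum(I(X[e], k) for e in X if e[0] == v)
            if v == source:
                ok = outflow == h          # rate h leaves the source
            elif v == k:
                ok = inflow == h           # rate h reaches receiver k
            else:
                ok = inflow == outflow     # conservation at every other node
            if not ok:
                return False
    return True

# Hypothetical unit-capacity butterfly with source "S" and receivers 1 and 2.
both, only1, only2 = frozenset({1, 2}), frozenset({1}), frozenset({2})
X = {("S", "A"): {both: 1}, ("S", "B"): {both: 1}, ("A", 1): {only1: 1},
     ("A", "C"): {only2: 1}, ("B", 2): {only2: 1}, ("B", "C"): {only1: 1},
     ("C", "D"): {both: 1}, ("D", 1): {only1: 1}, ("D", 2): {only2: 1}}
capacity = {e: 1 for e in X}
```

Calling the checker with `h = 2` on this instance accepts it; relabelling the flow on ("C", "D") as meant for receiver 1 only breaks conservation for receiver 2 at node "C".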

A. Routing, Replicating and Network Coding 

Let us take a closer look at the different operations that 
occur in a node in a multicast network. In Fig. 2 we see 
that there are three different operations that happen at a node. 
The first and simplest operation is when a packet is routed 
to one of the output edges. The second type of operation is 
replication in which multiple copies of the packet are sent 
along different edges. The third operation, network coding, 
refers to the case when two or more packets are combined 
into one packet. We will see that these three operations are 
sufficient to represent any necessary processing being done 
at the node but before that we need to understand what the 
different operations represent. 

We first look at routing and replication. Each packet that 
comes into a node has an associated set of receivers $Q \subseteq D$. 
The packet has to eventually reach each node in $Q$. When it 
gets routed onto one of the output edges, the packet on the 
output edge still has to reach all nodes in $Q$. In terms of the 
information flows to the various receivers, this corresponds 
to the case when overlapping flows or a simple flow pass 
through a node and continue unaffected. 

When a packet gets replicated, each copy of the packet 
on an output edge has to reach the nodes in $P_i$, a subset of $Q$. 
($P_i$ has to be a subset of $Q$ since the packet has to reach 
only nodes in $Q$.) Since the packet has to reach all nodes in 
$Q$ we have $\bigcup_i P_i = Q$. Moreover, the same packet does not 
need to reach the same destination along two different paths. 
Therefore the $P_i$'s are disjoint: $P_{i_1} \cap P_{i_2} = \emptyset \ \forall i_1 \ne i_2$. In the 
flow decomposition, replication corresponds to the point where 
two or more overlapping flows diverge. 

For example, consider node 3 in Fig. 2. The incoming 
packet has to be sent to both node 1 and node 2, so $X^3_{in} = 
[0, 0, 1]$. The node replicates it and forwards it to two edges. 
Along one of the edges the packet reaches node 1 ($X_{e(3,1)} = 
[1, 0, 0]$) and the packet sent on the other edge is meant for 
node 2 ($X_{e(3,2)} = [0, 1, 0]$). At the output $X^3_{out} = [1, 1, 0]$. 

This concept becomes clearer when we look at the 
relationship between $X_{in}$ and $X_{out}$. We consider the case for 
two and three destinations and then generalize the results. 
When there are two destinations, replication occurs only when 
a packet meant for both destinations is replicated and sent on 
two different paths, one path for each receiver node. $x_{in}(1,2)$ 
represents the average number of packets coming in per unit 
time that need to go to both 1 and 2. If $r$ ($r \ge 0$) of these 
packets are duplicated and transmitted per unit time we have 

$$x_{out}(1) = x_{in}(1) + r$$
$$x_{out}(2) = x_{in}(2) + r \tag{4}$$
$$x_{out}(1,2) = x_{in}(1,2) - r$$

Now consider the case with three destinations. Similar to 
the two-receiver case, a packet meant for two destinations can 
get replicated to produce two packets for the two destinations. 
Let $r_1$, $r_2$ and $r_3$ represent the amount of replication 
corresponding to flows to receiver sets $\{\{1\},\{2\}\}$, $\{\{1\},\{3\}\}$, 
and $\{\{2\},\{3\}\}$ respectively. When packets meant for all three 
receivers replicate, they split the flow in four possible ways: 
$\{\{1\},\{2\},\{3\}\}$, $\{\{1\},\{2,3\}\}$, $\{\{2\},\{1,3\}\}$ and $\{\{3\},\{1,2\}\}$. 
Let $r_4$, $r_5$, $r_6$ and $r_7$ represent the number of packets replicated 
per unit time corresponding to the four cases. The relation 
between $X_{in}$ and $X_{out}$ is therefore given by 
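
$$x_{out}(1) = x_{in}(1) + r_1 + r_2 + r_4 + r_5$$
$$x_{out}(2) = x_{in}(2) + r_1 + r_3 + r_4 + r_6$$
$$x_{out}(3) = x_{in}(3) + r_2 + r_3 + r_4 + r_7$$
$$x_{out}(1,2) = x_{in}(1,2) + r_7 - r_1 \tag{5}$$
$$x_{out}(1,3) = x_{in}(1,3) + r_6 - r_2$$
$$x_{out}(2,3) = x_{in}(2,3) + r_5 - r_3$$
$$x_{out}(1,2,3) = x_{in}(1,2,3) - r_4 - r_5 - r_6 - r_7$$

We note that each of the r's is $\ge 0$. Moreover, if all the 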
r's equal 0, then only routing is performed at the node. In the 
general case we will have a routing variable $r_j$ associated 
with every set $Q_j$, corresponding to flow for receivers in the 
set $\bigcup_{Q \in Q_j} Q$ being replicated with each copy meant for a set 
in $Q_j$. We denote the set of routing variables $r_j$ by $R$. The 
general equation is 

$$x_{out}(P_i) = x_{in}(P_i) + \sum_{j : P_i \in Q_j} r_j - \sum_{j : \bigcup_{Q \in Q_j} Q = P_i} r_j \tag{6}$$

Any node that is restricted to routing/replicating has to 
satisfy (6). We will call this constraint on $X_{in}$ and $X_{out}$ the 
routing constraint. Note that although we call the variables $r_j$ 
routing variables, they actually correspond to replication. 
Also, when we say a node is a routing node we allow for 
replication at that node. 
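
The routing constraint can be read as an update rule: each unit of r_j removes one packet meant for the union of Q_j and adds one packet for every member of Q_j. The sketch below applies that rule in Python; representing receiver sets as frozensets and a collection Q_j as a frozenset of frozensets is an assumption made here.

```python
def replicate(x_in, r):
    """Apply x_out(P) = x_in(P) + sum_{j: P in Q_j} r_j - sum_{j: union(Q_j) = P} r_j.

    x_in maps receiver sets (frozensets) to flow amounts; r maps a collection
    Q_j (a frozenset of disjoint frozensets) to its replication amount r_j.
    """
    x_out = dict(x_in)
    for Qj, rj in r.items():
        union = frozenset().union(*Qj)
        x_out[union] = x_out.get(union, 0) - rj      # packets that get split
        for P in Qj:
            x_out[P] = x_out.get(P, 0) + rj          # one copy per member set
    if any(amount < 0 for amount in x_out.values()):
        raise ValueError("replication would make an information flow negative")
    # Drop zero entries so the result lists only flows that are present.
    return {P: amount for P, amount in x_out.items() if amount != 0}
```

With a single incoming packet meant for {1, 2} and r = 1 for the collection {{1}, {2}}, the result is one packet for {1} and one for {2}, which is the node 3 example given earlier.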

The third type of operation is network coding. This happens 
at nodes where two or more flows merge. Similar to the routing 
variables we define a set of network coding variables $N$, where 
element $n_j$ represents the amount of flow meant for each set 
of receivers $Q \in Q_j$ that merges to form one flow of amount $n_j$ that has 
to reach all receivers in the set $\bigcup_{Q \in Q_j} Q$. That is, $n_j |Q_j|$ packets are 
network coded to form $n_j$ packets. It is easy to see that for a 
network coding node the relationship between $X_{in}$ and $X_{out}$ 
has to be of the form 

$$x_{out}(P_i) = x_{in}(P_i) + \sum_{j : P_i \in Q_j} r_j - \sum_{j : \bigcup_{Q \in Q_j} Q = P_i} r_j - \sum_{j : P_i \in Q_j} n_j + \sum_{j : \bigcup_{Q \in Q_j} Q = P_i} n_j \tag{7}$$

which reduces to 

$$x_{out}(P_i) = x_{in}(P_i) + \sum_{j : P_i \in Q_j} (r_j - n_j) - \sum_{j : \bigcup_{Q \in Q_j} Q = P_i} (r_j - n_j) \tag{8}$$

We note that it is sufficient to consider the variables $r_j - n_j$, 
but we retain both for now. We will refer to the conditions in 
(8) as node constraints. 
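
As a worked instance, derived here from the general node constraint rather than stated in the original, consider two receivers. The only collection is $Q_1 = \{\{1\},\{2\}\}$, so writing $r$ and $n$ for $r_1$ and $n_1$ the node constraints become:

```latex
\begin{align*}
x_{out}(1)   &= x_{in}(1)   + (r - n) \\
x_{out}(2)   &= x_{in}(2)   + (r - n) \\
x_{out}(1,2) &= x_{in}(1,2) - (r - n)
\end{align*}
```

Setting $n = 0$ recovers the two-destination replication equations. For a node that receives one packet meant only for receiver 1 and one meant only for receiver 2 and emits a single coded packet meant for both, $x_{in} = (1, 1, 0)$ and $x_{out} = (0, 0, 1)$, so $r - n = -1$, which is met for example by $r = 0$, $n = 1$.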

In the following theorem, for any pair of $X_{in}$ and $X_{out}$ that 
satisfy the flow constraints, we show that the operations at the 
node can be decomposed into routing, replicating and network 
coding operations and hence these operations are sufficient to 
represent any processing done at the node. 

Theorem 2: The relationship between $X_{in}$ and $X_{out}$ for 
any valid operation at the node can be expressed in terms of 
routing variables $R$ and network coding variables $N$ such that 
each element of $R$ and $N$ is $\ge 0$. 

Proof: We will give a particular solution satisfying all the 
conditions. The main idea used in constructing the particular 
solution is that all packets meant for more than one receiver 
can be replicated to produce packets such that each packet 
is meant for one receiver. They can then be suitably network 
coded to get the desired output information flow vector. 

Consider a set of receivers $P_i$ and the corresponding set $Q(P_i)$, 
the set of all singleton subsets of $P_i$. For every set $P_i \in \mathcal{P}$ 
containing two or more elements set $r_j = x_{in}(P_i)$ and 
$n_j = x_{out}(P_i)$ where $Q_j = Q(P_i)$. Set all other routing 
and network coding variables to 0. We will show that this 
solution satisfies the constraints in (8). On substituting for the 
routing and network coding variables that have been set to 0, 
for all non-singleton sets $P_i$ the constraints in (8) reduce to 
$x_{out}(P_i) = x_{in}(P_i) - r_j + n_j$, $Q_j = Q(P_i)$, which is satisfied 
by the choice of $r_j$ and $n_j$. For singleton sets $P = \{k\}$ we have 

$$x_{out}(k) = x_{in}(k) + \sum_{j : \{k\} \in Q_j} (r_j - n_j)$$
$$\phantom{x_{out}(k)} = x_{in}(k) + \sum_{j : \{k\} \in Q_j = Q(P_i),\ |P_i| > 1} (r_j - n_j)$$
$$\phantom{x_{out}(k)} = x_{in}(k) + \sum_{i : k \in P_i,\ |P_i| > 1} \left( x_{in}(P_i) - x_{out}(P_i) \right)$$

which is exactly the flow constraint on information flow to the 
receiver $k$ (Eq. (3)) and hence is satisfied. ■ 
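
The particular solution used in this proof is easy to reproduce numerically. The sketch below builds it and then checks the node constraints term by term; the function names and the frozenset-based encoding of receiver sets and collections are assumptions made here.

```python
def singletons(P):
    """Q(P): the collection of all singleton subsets of P."""
    return frozenset(frozenset({k}) for k in P)

def particular_solution(x_in, x_out):
    """Set r_j = x_in(P) and n_j = x_out(P) for Q_j = Q(P), for every |P| >= 2."""
    r, n = {}, {}
    for P in set(x_in) | set(x_out):
        if len(P) >= 2:
            r[singletons(P)] = x_in.get(P, 0)
            n[singletons(P)] = x_out.get(P, 0)
    return r, n

def node_constraint_holds(x_in, x_out, r, n):
    """Check the node constraints for every receiver set that appears anywhere."""
    collections = set(r) | set(n)
    sets = set(x_in) | set(x_out)
    for Qj in collections:
        sets |= set(Qj) | {frozenset().union(*Qj)}
    for P in sets:
        rhs = x_in.get(P, 0)
        for Qj in collections:
            d = r.get(Qj, 0) - n.get(Qj, 0)
            if P in Qj:
                rhs += d                       # P is one of the member sets
            if frozenset().union(*Qj) == P:
                rhs -= d                       # P is the union of the collection
        if rhs != x_out.get(P, 0):
            return False
    return True
```

For a coding node with one incoming packet for {1}, one for {2}, and one outgoing packet for {1, 2}, the construction yields r = 0 and n = 1 for the collection {{1}, {2}}, and the check succeeds. If the pair of vectors violates the flow constraints, the check on the singleton sets fails, as the proof predicts.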

Theorem 3: Given a network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, flow vectors $X_e$ 
for each edge $e \in \mathcal{E}$, and routing and network coding variables 
$R^v$ and $N^v$ for each node $v \in \mathcal{V}$ such that the edge, flow 
and node constraints are satisfied, we can construct a valid 
multicast code that performs routing and network coding as 
specified by $R^v$ and $N^v$. 

Proof: We prove the theorem by replacing each node 
in the network by a network that has routing and network 
coding nodes corresponding to the variables $R^v$ and $N^v$ such 
that there is no loss in the multicast rate. 

With every node $v \in \mathcal{V}$ associate a set of $2^K - 1$ 
nodes where each new node, $v(P_i)$, corresponds to one set 
of receivers $P_i \in \mathcal{P}$. For every set $P_i \in \mathcal{P}$ connect all the 
$x^v_{in}(P_i)$ incoming edges and the $x^v_{out}(P_i)$ outgoing edges of 
node $v$ carrying data for receivers in and only in set $P_i$ as 
input and output edges to the node $v(P_i)$. 



Corresponding to each nonzero routing variable $r^v_j$ construct 
$r^v_j$ nodes, each node having exactly one incoming edge 
coming from node $v(\bigcup_{P_i \in Q_j} P_i)$ and $|Q_j|$ outgoing edges that 
are connected as inputs to nodes in $\{v(P_i) : P_i \in Q_j\}$. 
Corresponding to each nonzero network coding variable $n^v_j$ 
construct $n^v_j$ nodes, with each node having one input edge from 
every node in $\{v(P_i) : P_i \in Q_j\}$ and one output edge that is 
connected as input to node $v(\bigcup_{P_i \in Q_j} P_i)$. 

Now the number of incoming edges to node $v(P_i)$ 
is $x^v_{in}(P_i) + \sum_{j : P_i \in Q_j} r^v_j + \sum_{j : \bigcup_{Q \in Q_j} Q = P_i} n^v_j$ and the 
number of outgoing edges is $x^v_{out}(P_i) + \sum_{j : P_i \in Q_j} n^v_j + 
\sum_{j : \bigcup_{Q \in Q_j} Q = P_i} r^v_j$. From (8) the number of incoming edges 
is equal to the number of outgoing edges. Randomly connect 
the set of input edges and the set of output edges of node 
$v(P_i)$ in a one-to-one manner and delete node $v(P_i)$. 

It is easy to see that this construction procedure replaces 
each node by a network that maintains the same flows and 
hence there is no loss in rate. ■ 

In the construction procedure provided in the proof of 
Theorem 3, the network that replaces each node could have 
cycles. These cycles are formed when a packet meant for a set 
of receivers $P_i$ goes through a series of network coding and 
routing operations to get back a packet meant for $P_i$ itself. 
Clearly the involvement of this packet in those operations is 
unnecessary. All cycles correspond to unnecessary operations 
and hence can be removed. We note that cycles within a 
node will be absent in solutions that minimize the number 
of network coding operations. The construction procedure 
provided can be used along with ideas of random network 
coding [3], [4] to construct multicast codes corresponding to 
the given information flow vectors. 

IV. Optimization 

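Since any solution to the set of linear equations specified 
by (1), (3) and (8) corresponds to a network coding solution, 
we can use the set of equations to obtain a network coding 
solution that minimizes a "cost" associated with the 
network code. The problem can be stated as follows: 

minimize Cost 
subject to 

$$x_e(P_i) \ge 0 \quad \forall P_i \in \mathcal{P}, \ \forall e \in \mathcal{E},$$
$$r^v_j \ge 0, \ n^v_j \ge 0 \quad \forall j, \ \forall v \in \mathcal{V}$$

Edge Constraints: 

$$\sum_{P_i \in \mathcal{P}} x_e(P_i) \le C(e) \quad \forall e \in \mathcal{E}$$

Node Constraints: 

$$x^v_{out}(P_i) = x^v_{in}(P_i) + \sum_{j : P_i \in Q_j} (r^v_j - n^v_j) - \sum_{j : \bigcup_{Q \in Q_j} Q = P_i} (r^v_j - n^v_j) \quad \forall P_i \in \mathcal{P}, \ \forall v \in \mathcal{V}$$
$$I_k(X^S_{out}) = h \quad \forall k$$
$$I_k(X^k_{in}) = h \quad \forall k \tag{9}$$

where 

$$x^v_{in}(P_i) = \sum_{e \in E_{in}(v)} x_e(P_i), \qquad x^v_{out}(P_i) = \sum_{e \in E_{out}(v)} x_e(P_i),$$
$$\text{and} \quad I_k(X_e) = \sum_{i : k \in P_i} x_e(P_i).$$

Note that we have dropped some of the flow constraints in (3) 
as they are satisfied automatically if the node constraints in 
(8) are satisfied. 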
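
To make the formulation concrete, the following Python sketch solves a tiny instance by exhaustive search rather than by linear programming. Everything specific in it is an assumption made here: the hypothetical unit-capacity butterfly network, the restriction to integer flows, the choice of the total amount of network coding as the cost, and the decision to impose the node constraints only at the intermediate nodes "A" to "D". For two receivers each node has a single pair (r, n), and the smallest n compatible with a given d = r - n is max(0, -d).

```python
from itertools import product

# Hypothetical butterfly network: source "S", receivers 1 and 2, unit capacities.
EDGES = [("S", "A"), ("S", "B"), ("A", 1), ("A", "C"), ("B", 2),
         ("B", "C"), ("C", "D"), ("D", 1), ("D", 2)]
# Per-edge choices of (x(1), x(2), x(1,2)) allowed by the unit edge constraint.
OPTIONS = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]

def node_sum(assign, v, incoming):
    """Sum of information flow vectors on edges entering (or leaving) node v."""
    pos = 1 if incoming else 0
    total = [0, 0, 0]
    for e, x in zip(EDGES, assign):
        if e[pos] == v:
            total = [t + c for t, c in zip(total, x)]
    return total

def min_coding_operations(h):
    """Smallest total n over integer solutions, or None if rate h is infeasible."""
    best = None
    for assign in product(OPTIONS, repeat=len(EDGES)):
        s_out = node_sum(assign, "S", incoming=False)
        if s_out[0] + s_out[2] != h or s_out[1] + s_out[2] != h:
            continue                          # I_k(X^S_out) = h fails
        in1, in2 = node_sum(assign, 1, True), node_sum(assign, 2, True)
        if in1[0] + in1[2] != h or in2[1] + in2[2] != h:
            continue                          # I_k(X^k_in) = h fails
        cost, feasible = 0, True
        for v in ("A", "B", "C", "D"):
            xin, xout = node_sum(assign, v, True), node_sum(assign, v, False)
            d = xout[0] - xin[0]              # plays the role of r - n
            if xout[1] - xin[1] != d or xin[2] - xout[2] != d:
                feasible = False
                break
            cost += max(0, -d)                # least n >= 0 with r = n + d >= 0
        if feasible and (best is None or cost < best):
            best = cost
    return best
```

On this instance the search reports that rate 2 needs one coding operation while rate 1 needs none, which agrees with the usual intuition about the butterfly network. A linear programming solver would replace the enumeration in any realistic setting.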

In the remainder of the section, we list a few natural cost 
criteria. 

1) Number of network coding nodes. Since additional 
coding capabilities are required at a node in order to 
perform network coding, it is potentially of interest 
to minimize the number of nodes performing network 
coding. Using Theorem 3 it follows that network coding 
needs to be performed at a node $v$ only if $n^v_i > 0$ for 
some $i$. Since $n^v_i \ge 0$, this condition is equivalent to 
$\sum_i n^v_i > 0$. Thus, the number of nodes in the network 
performing network coding is $\sum_{v \in \mathcal{V}} \mathbf{1}(\sum_i n^v_i > 0)$, 
which we choose as the cost function. 

However, note that for $n^v_i \ge 0$, the function 
$\sum_{v \in \mathcal{V}} \mathbf{1}(\sum_i n^v_i > 0)$ is a concave function and the 
problem becomes one of minimizing a concave function 
over a convex set. This problem might admit local 
minima, and standard convex minimization techniques 
cannot be used to solve it. We relax this problem 
and investigate minimizing the number of network 
coding operations and minimizing the number of packets 
involved in network coding in the following problems. 

2) Number of network coding operations. In this problem 
we investigate minimizing the number of network coding 
operations at a node $v$. From Theorem 3 it follows that 
network coding operations (linear encoding of packets) 
need to be performed corresponding to each $n^v_i$. Thus 
the number of network coding operations at node $v$ is 
$\sum_i n^v_i$. We define this quantity as the amount of network 
coding. Thus, the cost function in this problem is given 
by $\sum_{v \in \mathcal{V}} \sum_i n^v_i$. 

3) Number of packets involved in network coding. In 

this problem we investigate minimizing the number 
of packets over which network coding is performed 
at a node v. This is particularly relevant in optical 
networks when a conversion from optical signals to 
electrical signals is involved in order to encode the 
packets. We conjecture that the cost function is given 

by $\sum_{v \in \mathcal{V}} \sum_i \max(A^v_i, 0)$ where $A_i = \sum_{j : P_i \in Q_j} n_j - 
\sum_{j : \bigcup_{Q \in Q_j} Q = P_i} (r_j - \max_i \lambda_{ij})$. $\lambda_{ij}$ 
represents the number of packets meant for receivers $P_i$ that 
participate in network coding and that are obtained by 
routing packets meant for $\bigcup_{Q \in Q_j} Q$ ($P_i \in Q_j$). From the 
definition it follows that $0 \le \lambda_{ij} \le r_j$. 

4) Minimum resource cost. In the setup considered in [5], 
each edge $e$ is associated with a cost function $f_e(z_e)$ 
when the data rate on $e$ is $z_e$. The net cost associated 
with the network is then given by $\sum_e f_e(z_e)$. This cost 
was minimized over the set of equations specified by 
equations (1) and (2) in [5]. The same approach can 
be applied in the setting where only certain nodes are 
allowed to perform network coding. The restriction that 
a node $v$ can perform only routing can be imposed by 
further constraining the equations in (9) by $n^v_i = 0$ for 
all $i$. 

5) Maximum rate. The problem considered here is one of 
maximizing $h$ subject to (9) and additionally the 
set of equations $n^v_i = 0$ for all $i$ and all nodes $v$ which are 
restricted to routing. 
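
The first two cost criteria, and the routing-only restriction used in criteria 4 and 5, are simple functions of the network coding variables. The sketch below evaluates them; storing the variables as a dictionary from node to {index: value} and the function names are assumptions made here.

```python
def coding_nodes(N):
    """Criterion 1: number of nodes v with sum_i n_i^v > 0."""
    return sum(1 for n_v in N.values() if sum(n_v.values()) > 0)

def coding_operations(N):
    """Criterion 2: sum over nodes v and indices i of n_i^v."""
    return sum(sum(n_v.values()) for n_v in N.values())

def respects_routing_nodes(N, routing_nodes):
    """True if n_i^v = 0 for all i at every node restricted to routing."""
    return all(value == 0
               for v in routing_nodes
               for value in N.get(v, {}).values())

# Hypothetical solution: node "C" codes once, node "D" performs two operations.
N = {"A": {}, "C": {1: 1}, "D": {1: 0, 4: 2}}
```

Here two nodes perform coding and three coding operations take place in total; declaring "A" a routing node is compatible with this solution, while declaring "C" one is not.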

Note that problems 2, 3, 4 (if the cost function $f_e(\cdot)$ is 
linear) and 5 are linear problems and can be solved by standard 
linear programming approaches. It remains to be investigated 
if the decentralized subgradient optimization suggested in [5] 
can be applied to these problems. To provide 
decentralized solutions to these linear problems, we consider 
the approach suggested by [5] in which a linear function $ax$ 
is approximated by a strictly convex function $(ax)^{1+\alpha}$ where 
$\alpha > 0$ is chosen small enough for a valid approximation. This 
makes the problem a convex optimization problem which can 
be solved in a decentralized manner by a modified version of 
the primal-dual algorithm used in [5]. We do not prove this 
due to lack of space. The main idea in the proof is to show that 
the edge and node constraints involved are local, in the sense 
of involving only variables of the neighbouring edges or nodes, and 
then follow the same steps as used in [5]. 
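
A quick numerical look at the surrogate shows why a small exponent perturbation is a reasonable approximation. The helper names below are chosen here for illustration and the values of a, x and alpha are arbitrary.

```python
def perturbed_cost(a, x, alpha):
    """Strictly convex surrogate (a*x)**(1 + alpha) for the linear cost a*x, x >= 0."""
    return (a * x) ** (1.0 + alpha)

def relative_gap(a, x, alpha):
    """Relative difference between the surrogate and the linear cost (needs a*x > 0)."""
    return abs(perturbed_cost(a, x, alpha) - a * x) / (a * x)
```

For a fixed operating point the gap shrinks as alpha decreases, while for any alpha > 0 the surrogate stays strictly convex in x, which is the property the decentralized algorithm relies on.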

Problem 4 is a convex optimization problem if the function 
$f_e(\cdot)$ is convex. If we further assume that the function $f_e(\cdot)$ 
is strictly convex, it follows that problem 4 admits a unique 
solution. Further, it can be shown that the primal-dual 
algorithm used in [5] can be modified to solve problem 4 in a 
decentralized manner. Again we do not prove this due to lack 
of space. 

V. Conclusion 

In this paper, we presented a new information flow model 
to represent multicast flows. Using this model we set up 
optimization problems and presented distributed algorithms to 
minimize costs like number of packets undergoing network 
coding and amount of network coding. We also showed that 
this approach can be used to minimize network costs like link 
usage when some nodes are restricted to routing. 